Co-channel speech separation for robust automatic speech recognition: stability and efficiency
Authors
Abstract
A signal-separation front-end based on adaptive decorrelation filtering (ADF) was integrated with an HMM-based, speaker-independent continuous speech recognition system for co-channel speech recognition. The ADF is improved by addressing the adaptation gain for system stability and efficiency: an upper bound on the adaptation rate is derived for system stability, and an accelerated sequence of adaptation gains is introduced for system efficiency. The system was evaluated under simulated room acoustic conditions with both time-invariant and time-varying channels. The system significantly improved the signal-to-interference ratio and the recognition word accuracy, and combining the derived upper bound on the adaptation rate with the accelerated adaptation gain sequence achieved the best performance in terms of system stability and efficiency.
Similar resources
Adaptive co-channel speech separation and recognition
An improved technique of co-channel speech separation, S-AADF/LMS, and its integration with automatic speech recognition is presented. The S-AADF/LMS technique is based on the algorithms of accelerated adaptive decorrelation filtering (AADF) and LMS noise cancellation, where a switching between the two algorithms is made depending upon the active/inactive status of the co-channel signal sources...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
A Two-Channel Acoustic Front-End for Robust Automatic Speech Recognition in Noisy and Reverberant Environments
An acoustic front-end for robust automatic speech recognition in noisy and reverberant environments is proposed in this contribution. It comprises a blind source separation-based signal extraction scheme and only requires two microphone signals. The proposed front-end and its integration into the recognition system are analyzed and evaluated in noisy living room-like environments according to th...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN which has a bottleneck layer among its fully connected layers. The bottleneck fea...